The kappa B sequence in step 63 of the original application was obtained from HIV-1
(nt 350-374, nt 9435-9458), as shown on pages 8 and 11 of the sequence listing.
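
As an illustration of how the quoted coordinates map onto the listing, the following minimal Python sketch pulls nt 350-374 out of two consecutive 60-nt rows of the HXB2 record (the rows numbered 301 and 361). The helper name `subsequence` and the variable names are assumptions made for this example; GenBank coordinates are treated as 1-based and inclusive.

```python
# Two consecutive 60-nt rows copied from the ORIGIN block of the HXB2
# record (GenBank K03455); the first base of this fragment is nt 301.
ROW_START = 301
ROWS = (
    "agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg "
    "ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat"
)


def subsequence(rows: str, row_start: int, first: int, last: int) -> str:
    """Return nt first..last (1-based, inclusive) from a fragment whose
    first base has coordinate row_start. Hypothetical helper."""
    seq = rows.replace(" ", "")
    fragment_end = row_start + len(seq) - 1
    if first < row_start or last > fragment_end or first > last:
        raise ValueError("requested range lies outside the supplied fragment")
    return seq[first - row_start : last - row_start + 1]


kappa_b = subsequence(ROWS, ROW_START, 350, 374)
print(kappa_b)  # gggactttccgctggggactttcca
```

The extracted 25-nt stretch contains the tandem motif `gggactttcc` twice, which is consistent with its use as a kappa B probe.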
HIVHXB2CG 9719 bp ss-RNA VRL 19-AUG-1999

Human immunodeficiency virus type 1 (HXB2), complete genome;
HIV1/HTLV-III/LAV reference genome.
K03455 M38432
K03455.1 GI:1906382

TAR protein; acquired immune deficiency syndrome; complete genome;
env protein; gag protein; long terminal repeat (LTR); pol protein;
polyprotein; proviral gene; reverse transcriptase; transactivator.
Human immunodeficiency virus type 1. 
Human immunodeficiency virus type 1 

Viruses; Retroid viruses; Retroviridae; Lentivirus; Primate 
lentivirus group. 

1 (bases 493 to 674; 9577 to 9718) 

Ratner,L., Haseltine,W., Patarca,R., Livak,K.J., Starcich,B.,
Josephs,S.F., Doran,E.R., Rafalski,J.A., Whitehorn,E.A.,
Baumeister,K., Ivanoff,L., Petteway,S.R. Jr., Pearson,M.L.,
Lautenberger,J.A., Papas,T.S., Ghrayeb,J., Chang,N.T., Gallo,R.C.
and Wong-Staal,F.

Complete nucleotide sequence of the AIDS virus, HTLV-III

Nature 313 (6000), 277-284 (1985)

85111123 

2578615 

2 (bases 1 to 653) 

Starcich,B., Ratner,L., Josephs,S.F., Okamoto,T., Gallo,R.C. and
Wong-Staal,F.

Characterization of long terminal repeat sequences of HTLV-III 

Science 227 (4686), 538-540 (1985) 

85090465 

3 (sites) 

Allan,J.S., Coligan,J.E., Barin,F., McLane,M.F., Sodroski,J.G.,
Rosen,C.A., Haseltine,W.A., Lee,T.H. and Essex,M.

Major glycoprotein antigens that induce antibodies in AIDS patients
are encoded by HTLV-III

Science 228 (4703), 1091-1094 (1985) 

85192537 

4 (sites) 

Sodroski,J., Patarca,R., Rosen,C., Wong-Staal,F. and Haseltine,W.
Location of the trans-activating region on the genome of human
T-cell lymphotropic virus type III 
Science 229 (4708), 74-77 (1985) 
85244627 

5 (sites) 

Arya,S.K., Guo,C., Josephs,S.F. and Wong-Staal,F.
Trans-activator gene of human T-lymphotropic virus type III
(HTLV-III)

Science 229 (4708), 69-73 (1985)
85244626

6 (sites) 

van Beveren,C.P., Coffin,J. and Hughes,S.
Appendix B: HTLV-3/LAV genome

(in) Weiss,R.L., Teich,N., Varmus,H. and Coffin,J. (Eds.);
RNA TUMOR VIRUSES, SECOND EDITION, 2, Vol. 2: 1102-1123; 
Cold Spring Harbor Laboratory, Cold Spring Harbor (1985) 

7 (sites) 

Rosen,C.A., Sodroski,J.G. and Haseltine,W.A.

The location of cis-acting regulatory sequences in the human T cell
lymphotropic virus type III (HTLV-III/LAV) long terminal repeat 
Cell 41 (3), 813-823 (1985) 
85228232 

8 (sites) 

Rabson,A.B., Daugherty,D.F., Venkatesan,S., Boulukos,K.E.,
Benn,S.I., Folks,T.M., Feorino,P. and Martin,M.A.
Transcription of novel open reading frames of AIDS retrovirus 
during infection of lymphocytes 
Science 229 (4720), 1388-1390 (1985) 
85300515 

9 (sites) 

Allan,J.S., Coligan,J.E., Lee,T.H., McLane,M.F., Kanki,P.J.,
Groopman,J.E. and Essex,M.

A new HTLV-III/LAV encoded antigen detected by antibodies from AIDS 
patients 

Science 230 (4727), 810-813 (1985) 
86044509 

10 (sites) 

Rosen,C.A., Sodroski,J.G., Goh,W.C., Dayton,A.I., Lippke,J. and
Haseltine,W.A.

Post-transcriptional regulation accounts for the trans-activation
of the human T-lymphotropic virus type III
Nature 319 (6054), 555-559 (1986) 
86118720 

11 (sites) 

di Marzo Veronese,F., Copeland,T.D., DeVico,A.L., Rahman,R.,
Oroszlan,S., Gallo,R.C. and Sarngadharan,M.G.

Characterization of highly immunogenic p66/p51 as the reverse 
transcriptase of HTLV-III/LAV 
Science 231 (4743), 1289-1291 (1986) 
86122937 

12 (sites) 

Kan,N.C., Franchini,G., Wong-Staal,F., DuBois,G.C., Robey,W.G.,
Lautenberger,J.A. and Papas,T.S.

Identification of HTLV-III/LAV sor gene product and detection of
antibodies in human sera

Science 231 (4745), 1553-1555 (1986) 

86151663 

13 (sites) 

Kramer,R.A., Schaber,M.D., Skalka,A.M., Ganguly,K., Wong-Staal,F.
and Reddy,E.P.

HTLV-III gag protein is processed in yeast cells by the virus
pol-protease

Science 231 (4745), 1580-1584 (1986) 
86151671 

14 (sites) 

Lee,T.H., Coligan,J.E., Allan,J.S., McLane,M.F., Groopman,J.E. and
Essex,M.

A new HTLV-III/LAV protein encoded by a gene found in cytopathic
retroviruses

Science 231 (4745), 1546-1549 (1986)
86151661 

15 (sites) 

Dayton,A.I., Sodroski,J.G., Rosen,C.A., Goh,W.C. and Haseltine,W.A.
The trans-activator gene of the human T cell lymphotropic virus
type III is required for replication
Cell 44 (6), 941-947 (1986)
86161683 

16 (sites) 

Sodroski,J., Goh,W.C., Rosen,C., Tartar,A., Portetelle,D., Burny,A.
and Haseltine,W.

Replicative and cytopathic potential of HTLV-III/LAV with sor gene 
deletions 

Science 231 (4745), 1549-1553 (1986) 
86151662 

17 (sites) 

Arya,S.K. and Gallo,R.C. 

Three novel genes of human T-lymphotropic virus type III: immune
reactivity of their products with sera from acquired immune 
deficiency syndrome patients 

Proc. Natl. Acad. Sci. U.S.A. 83 (7), 2209-2213 (1986) 
86177573 

18 (sites) 

Jones,K.A., Kadonaga,J.T., Luciw,P.A. and Tjian,R.

Activation of the AIDS retrovirus promoter by the cellular
transcription factor, Sp1

Science 232 (4751), 755-759 (1986) 

86179897 

19 (sites) 

Sodroski,J., Goh,W.C., Rosen,C., Dayton,A., Terwilliger,E. and
Haseltine,W.

A second post-transcriptional trans-activator gene required for
HTLV-III replication

Nature 321 (6068), 412-417 (1986) 

86230863 

20 (sites) 

Starcich,B.R., Hahn,B.H., Shaw,G.M., McNeely,P.D., Modrow,S.,
Wolf,H., Parks,E.S., Parks,W.P., Josephs,S.F., Gallo,R.C. and
Wong-Staal,F.

Identification and characterization of conserved and variable
regions in the envelope gene of HTLV-III/LAV, the retrovirus of
AIDS

Cell 45 (5), 637-648 (1986)
86218077 

21 (sites) 

Willey,R.L., Rutledge,R.A., Dias,S., Folks,T., Theodore,T.,
Buckler,C.E. and Martin,M.A.

Identification of conserved and divergent domains within the 
envelope gene of the acquired immunodeficiency syndrome retrovirus 
Proc. Natl. Acad. Sci. U.S.A. 83 (14), 5038-5042 (1986) 
86259728 

22 (bases 8761 to 9060) 

Fisher,A.G., Ratner,L., Mitsuya,H., Marselle,L.M., Harper,M.E.,
Broder,S., Gallo,R.C. and Wong-Staal,F.
Infectious mutants of HTLV-III with changes in the 3' region and
markedly reduced cytopathic effects 
Science 233 (4764), 655-659 (1986) 
86261824 

23 (sites) 

Feinberg,M.B., Jarrett,R.F., Aldovini,A., Gallo,R.C. and
Wong-Staal,F.

HTLV-III expression and production involve complex regulation at 
the levels of splicing and translation of viral RNA 
Cell 46 (6), 807-817 (1986) 
87002448 

24 (sites) 

Lightfoote,M.M., Coligan,J.E., Folks,T.M., Fauci,A.S., Martin,M.A.
and Venkatesan,S.

Structural characterization of reverse transcriptase and 
endonuclease polypeptides of the acquired immunodeficiency syndrome 
retrovirus 

J. Virol. 60 (2), 771-775 (1986) 
87036947 

25 (sites) 

Wright,C.M., Felber,B.K., Paskalis,H. and Pavlakis,G.N.
Expression and characterization of the trans-activator of 
HTLV-III/LAV virus 

Science 234 (4779), 988-992 (1986) 
87042788 

26 (sites) 

Terwilliger,E., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
Effects of mutations within the 3' orf open reading frame region of
human T-cell lymphotropic virus type III (HTLV-III/LAV) on
replication and cytopathogenicity 
J. Virol. 60 (2), 754-760 (1986) 
87036943 

27 (sites) 

Goh,W.C., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
Expression of the art gene protein of human T-lymphotropic virus
type III (HTLV-III/LAV) in bacteria 
J. Virol. 61 (2), 633-637 (1987) 
87112968 

28 (sites) 

Modrow,S., Hahn,B.H., Shaw,G.M., Gallo,R.C., Wong-Staal,F. and
Wolf,H.

Computer-assisted analysis of envelope protein sequences of seven 
human immunodeficiency virus isolates: prediction of antigenic 
epitopes in conserved and variable regions 
J. Virol. 61 (2), 570-578 (1987) 
87112954 

29 (sites) 

Muesing,M.A., Smith,D.H. and Capon,D.J.

Regulation of mRNA accumulation by a human immunodeficiency virus 
trans-activator protein 
Cell 48 (4), 691-701 (1987) 
87131081 

30 (sites) 

Nabel,G. and Baltimore,D.

An inducible transcription factor activates expression of human
immunodeficiency virus in T cells
Nature 326 (6114), 711-713 (1987)
87173065

Erratum:[Nature 1990 Mar 8;344(6262):178]

31 (sites) 

Fisher,A.G., Ensoli,B., Ivanoff,L., Chamberlain,M., Petteway,S.,
Ratner,L., Gallo,R.C. and Wong-Staal,F.

The sor gene of HIV-1 is required for efficient virus transmission
in vitro

Science 237 (4817), 888-893 (1987)
87292118 

32 (sites) 

Patarca,R., Heath,C., Goldenberg,G.J., Rosen,C.A., Sodroski,J.G.,
Haseltine,W.A. and Hansen,U.M.

Transcription directed by the HIV long terminal repeat in vitro 

AIDS Res. Hum. Retroviruses 3 (1), 41-55 (1987) 

87299195 

33 (sites) 

Wong-Staal,F., Chanda,P.K. and Ghrayeb,J.
Human immunodeficiency virus; the eighth gene 
AIDS Res. Hum. Retroviruses 3 (1), 33-39 (1987) 
87299194 

34 (bases 1 to 9635/ 1 to 9635) 

Ratner,L., Fisher,A., Jagodzinski,L.L., Mitsuya,H., Liou,R.S.,
Gallo,R.C. and Wong-Staal,F.

Complete nucleotide sequences of functional clones of the AIDS 
virus 

AIDS Res. Hum. Retroviruses 3 (1), 57-69 (1987) 
87299196 

35 (bases 6225 to 8795) 

Reitz,M.S. Jr., Wilson,C., Naugle,C., Gallo,R.C. and
Robert-Guroff,M.

Generation of a neutralization-resistant variant of HIV-1 is due to
selection for a point mutation in the envelope gene
Cell 54 (1), 57-63 (1988)
88253426 

36 (bases 790 to 2292) 

Pal,R., Reitz,M.S. Jr., Tschachler,E., Gallo,R.C.,
Sarngadharan,M.G. and Veronese,F.D.

Myristoylation of gag proteins of HIV-1 plays an important role in 
virus assembly 

AIDS Res. Hum. Retroviruses 6 (6), 721-730 (1990) 
90303964 

37 (sites) 

Ido,E., Han,H.P., Kezdy,F.J. and Tang,J.

Kinetic studies of human immunodeficiency virus type 1 protease and
its active-site hydrogen bond mutant A28S
J. Biol. Chem. 266 (36), 24359-24366 (1991) 
92105089 

On Mar 25, 1997 this sequence version replaced gi:327742.
[6] sites; tat mRNA and other transcript boundaries. [7] sites;
tat mRNA.
[8] sites; mRNA splice sites.
[9] sites; 27K antigen cds.
[5] sites; gp160 and gp120 coding sequences.
[1] sites; regulatory sequences in the LTR.
[(in) Weiss,R., Teich,N., Varmus,H. and Coffin,J. (Eds.);RNA Tumor
Viruses, Secon] review; bases 1 to 9718.
[15] sites; trans-activator function and TAR sequence. [19]
sites; pol coding sequence.
[22] sites; 23K sor gene product.
[23] sites; pol NH2-terminal region.
[20] sites; sor 23K protein.
[21] sites; sor 23K protein.
[24] sites; Sp1 binding sites in the promoter region. [17] sites;
acceptor and donor splice sites for tat and 27K. [10] sites;
deletion mutants in the tat gene.
[18] sites; env gene conserved/variable regions; separate entries.
[16] sites; trs cds boundaries.
[12] sites; trs cds boundaries.
[11] sites; env gene conserved/variable regions; separate entries.
[26] sites; tar or transactivator target.
[13] sites; 3' orf mutations.
[14] sites; pol p34 terminus.
[31] sites; promoter, TAR, tat-III mutants.
[32] sites; envelope protein epitopes.
[33] sites; trs/art protein.
[34] sites; inducible enhancer element.
[27] revises [30].
[29] sites; long terminal repeat.
[28] sites; R orf.
[35] sites; sor.
Sequence for [25] kindly provided in computer-readable form by
L.Ratner, 19-AUG-1986.
The HXB2 sequence is being used as a reference genome for all the
HIV entries because it has been derived from a demonstrably
infectious clone. Hence not all of the 'sites' references above
were concerned with this isolate.
FEATURES             Location/Qualifiers
     source          1..9719
                     /organism="Human immunodeficiency virus type 1"
                     /proviral
                     /isolate="HXB2"
                     /db_xref="taxon:11676"
                     /note="HTLV-III/LAV"
     LTR             1..634
                     /note="5' LTR"
     repeat_region   454..551
                     /note="R repeat 5' copy"
     mRNA            455..9635
                     /product="HXB2 genomic mRNA"
     prim_transcript 455..9635
                     /note="tat, trs, 27K subgenomic mRNA"
     intron          744..5777
                     /note="tat, trs, 27K mRNA intron 1"
     CDS             790..2292
                     /note="gag polyprotein"
                     /codon_start=1
                     /protein_id="AAB50258.1"
                     /db_xref="GI:327745"
                     /translation="MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERF
                     AVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALD
                     KIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
                     EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPV
                     HAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRM
                     YSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTIL
                     KALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKC
                     FNCGKEGHTARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQ
                     SRPEPTAPPEESFRSGVETTTPPQKQEPIDKELYPLTSLRSLFGNDPSSQ"
     CDS             2358..5096
                     /note="pol polyprotein (NH2-terminus uncertain)"
                     /codon_start=1
                     /protein_id="AAB50259.1"
                     /db_xref="GI:1906384"
                     /translation="MSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGP
                     TPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEIC
                     TEMEKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPH
                     PAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWK
                     GSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRW
                     GLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQI
                     YPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAE
                     IQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKT
                     PKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDG
                     AANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYA
                     LGIIQAQPDQSESELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKVL
                     FLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSP
                     GIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTD
                     NGSNFTGATVRAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKT
                     AVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRN
                     PLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED"

     CDS             5041..5619
                     /note="sor 23K protein"
                     /codon_start=1
                     /protein_id="AAB50260.1"
                     /db_xref="GI:327747"
                     /translation="MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHY
                     ESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDPEL
                     ADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLALAALITPKKIK
                     PPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH"
     CDS             5559..5795
                     /note="R (ORF) protein"
                     /codon_start=1
                     /protein_id="AAB50261.1"
                     /db_xref="GI:327748"
                     /translation="MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQ
                     HIYETYGDTWAGVEAIIRILQQLLFIHFQNWVST"

     CDS             join(5831..6045,8379..8424)
                     /note="tat protein"
                     /codon_start=1
                     /protein_id="AAB50256.1"
                     /db_xref="GI:1906383"
                     /translation="MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALG
                     ISYGRKKRRQRRRAHQNSQTHQASLSKQPTSQPRGDPTGPKE"
     exon            5831..6045
                     /note="tat protein, first expressed exon"
                     /number=2

     CDS             join(5970..6045,8379..8653)
                     /note="trs protein"
                     /codon_start=1
                     /protein_id="AAB50257.1"
                     /db_xref="GI:327744"
                     /translation="MAGRSGDSDEELIRTVRLIKLLYQSNPPPNPEGTRQARRNRRRR
                     WRERQRQIHSISERILGTYLGRSAEPVPLQLPPLERLTLDCNEDCGTSGTQGVGSPQI
                     LVESPTVLESGTKE"
     exon            5970..6045
                     /note="trs protein, first expressed exon"
                     /number=2
     intron          6046..8378
                     /note="tat, trs, 27K mRNA intron 2"
     CDS             6225..8795
                     /note="envelope polyprotein"
                     /codon_start=1
                     /protein_id="AAB50262.1"
                     /db_xref="GI:1906385"
                     /translation="MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPV
                     WKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFNMWKNDMVE
                     QMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
                     ISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEPIPIHYC
                     APAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVN
                     FTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNIS
                     RAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFN
                     STWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNIT
                     GLLLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQR
                     EKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIEAQQHLL
                     QLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWN
                     HTTWMEWDREINNYTSLIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYI
                     KLFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTHLPTPRGPDRPEGIEEEGGER
                     DRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWN
                     LLQYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGACRAIRHIPRRIRQGLERILL"

     exon            8379..8652
                     /note="trs protein"
                     /number=3
     exon            8379..8424
                     /note="tat protein"
                     /number=3
     CDS             8797..9168
                     /note="27K protein (premature termination)"
                     /codon_start=1
                     /protein_id="AAB50263.1"
                     /db_xref="GI:1906386"
                     /translation="MGGKWSKSSVIGWPTVRERMRRAEPAADRVGAASRDLEKHGAIT
                     SSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIH
                     SQRRQDILDLWIYHTQGYFPD"
     LTR             9086..9719
                     /note="3' LTR"
     repeat_region   9540..9636
                     /note="R repeat 3' copy"
     polyA_signal    9612..9617
                     /note="HXB2 mRNA polyadenylation signal"
BASE COUNT 3411 a 1772 c 2373 g 2163 t 

ORIGIN 

1 tggaagggct aattcactcc caacgaagac aagatatcct tgatctgtgg atctaccaca 
61 cacaaggcta cttccctgat tagcagaact acacaccagg gccagggatc agatatccac 
121 tgacctttgg atggtgctac aagctagtac cagttgagcc agagaagtta gaagaagcca 
181 acaaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatggaatg gatgacccgg 
241 agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac atggcccgag 
301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg 
361 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat
421 cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga 



481 gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct

541 tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc 

601 agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacctgaaag 

661 cgaaagggaa accagaggag ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg 

721 caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc ggaggctaga 

781 aggagagaga tgggtgcgag agcgtcagta ttaagcgggg gagaattaga tcgatgggaa 

841 aaaattcggt taaggccagg gggaaagaaa aaatataaat taaaacatat agtatgggca 

901 agcagggagc tagaacgatt cgcagttaat cctggcctgt tagaaacatc agaaggctgt 

961 agacaaatac tgggacagct acaaccatcc cttcagacag gatcagaaga acttagatca 

1021 ttatataata cagtagcaac cctctattgt gtgcatcaaa ggatagagat aaaagacacc 

1081 aaggaagctt tagacaagat agaggaagag caaaacaaaa gtaagaaaaa agcacagcaa 

1141 gcagcagctg acacaggaca cagcaatcag gtcagccaaa attaccctat agtgcagaac 

1201 atccaggggc aaatggtaca tcaggccata tcacctagaa ctttaaatgc atgggtaaaa
1261 gtagtagaag agaaggcttt cagcccagaa gtgataccca tgttttcagc attatcagaa 
1321 ggagccaccc cacaagattt aaacaccatg ctaaacacag tggggggaca tcaagcagcc 

1381 atgcaaatgt taaaagagac catcaatgag gaagctgcag aatgggatag agtgcatcca
1441 gtgcatgcag ggcctattgc accaggccag atgagagaac caaggggaag tgacatagca 
1501 ggaactacta gtacccttca ggaacaaata ggatggatga caaataatcc acctatccca 
1561 gtaggagaaa tttataaaag atggataatc ctgggattaa ataaaatagt aagaatgtat 
1621 agccctacca gcattctgga cataagacaa ggaccaaagg aaccctttag agactatgta 
1681 gaccggttct ataaaactct aagagccgag caagcttcac aggaggtaaa aaattggatg 
1741 acagaaacct tgttggtcca aaatgcgaac ccagattgta agactatttt aaaagcattg 
1801 ggaccagcgg ctacactaga agaaatgatg acagcatgtc agggagtagg aggacccggc 
1861 cataaggcaa gagttttggc tgaagcaatg agccaagtaa caaattcagc taccataatg 
1921 atgcagagag gcaattttag gaaccaaaga aagattgtta agtgtttcaa ttgtggcaaa 
1981 gaagggcaca cagccagaaa ttgcagggcc cctaggaaaa agggctgttg gaaatgtgga 
2041 aaggaaggac accaaatgaa agattgtact gagagacagg ctaatttttt agggaagatc 
2101 tggccttcct acaagggaag gccagggaat tttcttcaga gcagaccaga gccaacagcc 
2161 ccaccagaag agagcttcag gtctggggta gagacaacaa ctccccctca gaagcaggag 
2221 ccgatagaca aggaactgta tcctttaact tccctcaggt cactctttgg caacgacccc 
2281 tcgtcacaat aaagataggg gggcaactaa aggaagctct attagataca ggagcagatg
2341 atacagtatt agaagaaatg agtttgccag gaagatggaa accaaaaatg atagggggaa 
2401 ttggaggttt tatcaaagta agacagtatg atcagatact catagaaatc tgtggacata
2461 aagctatagg tacagtatta gtaggaccta cacctgtcaa cataattgga agaaatctgt 
2521 tgactcagat tggttgcact ttaaattttc ccattagccc tattgagact gtaccagtaa 
2581 aattaaagcc aggaatggat ggcccaaaag ttaaacaatg gccattgaca gaagaaaaaa 
2641 taaaagcatt agtagaaatt tgtacagaga tggaaaagga agggaaaatt tcaaaaattg 
2701 ggcctgaaaa tccatacaat actccagtat ttgccataaa gaaaaaagac agtactaaat 
2761 ggagaaaatt agtagatttc agagaactta ataagagaac tcaagacttc tgggaagttc 
2821 aattaggaat accacatccc gcagggttaa aaaagaaaaa atcagtaaca gtactggatg
2881 tgggtgatgc atatttttca gttcccttag atgaagactt caggaagtat actgcattta 
2941 ccatacctag tataaacaat gagacaccag ggattagata tcagtacaat gtgcttccac 
3001 agggatggaa aggatcacca gcaatattcc aaagtagcat gacaaaaatc ttagagcctt 
3061 ttagaaaaca aaatccagac atagttatct atcaatacat ggatgatttg tatgtaggat 
3121 ctgacttaga aatagggcag catagaacaa aaatagagga gctgagacaa catctgttga 
3181 ggtggggact taccacacca gacaaaaaac atcagaaaga acctccattc ctttggatgg 
3241 gttatgaact ccatcctgat aaatggacag tacagcctat agtgctgcca gaaaaagaca 
3301 gctggactgt caatgacata cagaagttag tggggaaatt gaattgggca agtcagattt 

3361 acccagggat taaagtaagg caattatgta aactccttag aggaaccaaa gcactaacag
3421 aagtaatacc actaacagaa gaagcagagc tagaactggc agaaaacaga gagattctaa 

3481 aagaaccagt acatggagtg tattatgacc catcaaaaga cttaatagca gaaatacaga
3541 agcaggggca aggccaatgg acatatcaaa tttatcaaga gccatttaaa aatctgaaaa 
3601 caggaaaata tgcaagaatg aggggtgccc acactaatga tgtaaaacaa ttaacagagg 
3661 cagtgcaaaa aataaccaca gaaagcatag taatatgggg aaagactcct aaatttaaac 
3721 tgcccataca aaaggaaaca tgggaaacat ggtggacaga gtattggcaa gccacctgga 
3781 ttcctgagtg ggagtttgtt aatacccctc ccttagtgaa attatggtac cagttagaga 
3841 aagaacccat agtaggagca gaaaccttct atgtagatgg ggcagctaac agggagacta 
3901 aattaggaaa agcaggatat gttactaata gaggaagaca aaaagttgtc accctaactg
3961 acacaacaaa tcagaagact gagttacaag caatttatct agctttgcag gattcgggat
4021 tagaagtaaa catagtaaca gactcacaat atgcattagg aatcattcaa gcacaaccag
4081 atcaaagtga atcagagtta gtcaatcaaa taatagagca gttaataaaa aaggaaaagg
4141 tctatctggc atgggtacca gcacacaaag gaattggagg aaatgaacaa gtagataaat
4201 tagtcagtgc tggaatcagg aaagtactat ttttagatgg aatagataag gcccaagatg
4261 aacatgagaa atatcacagt aattggagag caatggctag tgattttaac ctgccacctg
4321 tagtagcaaa agaaatagta gccagctgtg ataaatgtca gctaaaagga gaagccatgc
4381 atggacaagt agactgtagt ccaggaatat ggcaactaga ttgtacacat ttagaaggaa
4441 aagttatcct ggtagcagtt catgtagcca gtggatatat agaagcagaa gttattccag
4501 cagaaacagg gcaggaaaca gcatattttc ttttaaaatt agcaggaaga tggccagtaa
4561 aaacaataca tactgacaat ggcagcaatt tcaccggtgc tacggttagg gccgcctgtt
4621 ggtgggcggg aatcaagcag gaatttggaa ttccctacaa tccccaaagt caaggagtag
4681 tagaatctat gaataaagaa ttaaagaaaa ttataggaca ggtaagagat caggctgaac
4741 atcttaagac agcagtacaa atggcagtat tcatccacaa ttttaaaaga aaagggggga
4801 ttggggggta cagtgcaggg gaaagaatag tagacataat agcaacagac atacaaacta
4861 aagaattaca aaaacaaatt acaaaaattc aaaattttcg ggtttattac agggacagca
4921 gaaatccact ttggaaagga ccagcaaagc tcctctggaa aggtgaaggg gcagtagtaa
4981 tacaagataa tagtgacata aaagtagtgc caagaagaaa agcaaagatc attagggatt
5041 atggaaaaca gatggcaggt gatgattgtg tggcaagtag acaggatgag gattagaaca
5101 tggaaaagtt tagtaaaaca ccatatgtat gtttcaggga aagctagggg atggttttat
5161 agacatcact atgaaagccc tcatccaaga ataagttcag aagtacacat cccactaggg
5221 gatgctagat tggtaataac aacatattgg ggtctgcata caggagaaag agactggcat
5281 ttgggtcagg gagtctccat agaatggagg aaaaagagat atagcacaca agtagaccct
5341 gaactagcag accaactaat tcatctgtat tactttgact gtttttcaga ctctgctata
5401 agaaaggcct tattaggaca catagttagc cctaggtgtg aatatcaagc aggacataac
5461 aaggtaggat ctctacaata cttggcacta gcagcattaa taacaccaaa aaagataaag
5521 ccacctttgc ctagtgttac gaaactgaca gaggatagat ggaacaagcc ccagaagacc
5581 aagggccaca gagggagcca cacaatgaat ggacactaga gcttttagag gagcttaaga
5641 atgaagctgt tagacatttt cctaggattt ggctccatgg cttagggcaa catatctatg
5701 aaacttatgg ggatacttgg gcaggagtgg aagccataat aagaattctg caacaactgc
5761 tgtttatcca ttttcagaat tgggtgtcga catagcagaa taggcgttac tcgacagagg
5821 agagcaagaa atggagccag tagatcctag actagagccc tggaagcatc caggaagtca
5881 gcctaaaact gcttgtacca attgctattg taaaaagtgt tgctttcatt gccaagtttg
5941 tttcataaca aaagccttag gcatctccta tggcaggaag aagcggagac agcgacgaag
6001 agctcatcag aacagtcaga ctcatcaagc ttctctatca aagcagtaag tagtacatgt
6061 aacgcaacct ataccaatag tagcaatagt agcattagta gtagcaataa taatagcaat
6121 agttgtgtgg tccatagtaa tcatagaata taggaaaata ttaagacaaa gaaaaataga
6181 caggttaatt gatagactaa tagaaagagc agaagacagt ggcaatgaga gtgaaggaga
6241 aatatcagca cttgtggaga tgggggtgga gatggggcac catgctcctt gggatgttga
6301 tgatctgtag tgctacagaa aaattgtggg tcacagtcta ttatggggta cctgtgtgga
6361 aggaagcaac caccactcta ttttgtgcat cagatgctaa agcatatgat acagaggtac
6421 ataatgtttg ggccacacat gcctgtgtac ccacagaccc caacccacaa gaagtagtat
6481 tggtaaatgt gacagaaaat tttaacatgt ggaaaaatga catggtagaa cagatgcatg
6541 aggatataat cagtttatgg gatcaaagcc taaagccatg tgtaaaatta accccactct
6601 gtgttagttt aaagtgcact gatttgaaga atgatactaa taccaatagt agtagcggga
6661 gaatgataat ggagaaagga gagataaaaa actgctcttt caatatcagc acaagcataa
6721 gaggtaaggt gcagaaagaa tatgcatttt tttataaact tgatataata ccaatagata
6781 atgatactac cagctataag ttgacaagtt gtaacacctc agtcattaca caggcctgtc
6841 caaaggtatc ctttgagcca attcccatac attattgtgc cccggctggt tttgcgattc
6901 taaaatgtaa taataagacg ttcaatggaa caggaccatg tacaaatgtc agcacagtac
6961 aatgtacaca tggaattagg ccagtagtat caactcaact gctgttaaat ggcagtctag
7021 cagaagaaga ggtagtaatt agatctgtca atttcacgga caatgctaaa accataatag
7081 tacagctgaa cacatctgta gaaattaatt gtacaagacc caacaacaat acaagaaaaa
7141 gaatccgtat ccagagagga ccagggagag catttgttac aataggaaaa ataggaaata
7201 tgagacaagc acattgtaac attagtagag caaaatggaa taacacttta aaacagatag
7261 ctagcaaatt aagagaacaa tttggaaata ataaaacaat aatctttaag caatcctcag
7321 gaggggaccc agaaattgta acgcacagtt ttaattgtgg aggggaattt ttctactgta
7381 attcaacaca actgtttaat agtacttggt ttaatagtac ttggagtact gaagggtcaa
7441 ataacactga aggaagtgac acaatcaccc tcccatgcag aataaaacaa attataaaca
7501 tgtggcagaa agtaggaaaa gcaatgtatg cccctcccat cagtggacaa attagatgtt
7561 catcaaatat tacagggctg ctattaacaa gagatggtgg taatagcaac aatgagtccg
7621 agatcttcag acctggagga ggagatatga gggacaattg gagaagtgaa ttatataaat
7681 ataaagtagt aaaaattgaa ccattaggag tagcacccac caaggcaaag agaagagtgg
7741 tgcagagaga aaaaagagca gtgggaatag gagctttgtt ccttgggttc ttgggagcag
7801 caggaagcac tatgggcgca gcctcaatga cgctgacggt acaggccaga caattattgt
7861 ctggtatagt gcagcagcag aacaatttgc tgagggctat tgaggcgcaa cagcatctgt
7921 tgcaactcac agtctggggc atcaagcagc tccaggcaag aatcctggct gtggaaagat
7981 acctaaagga tcaacagctc ctggggattt ggggttgctc tggaaaactc atttgcacca
8041 ctgctgtgcc ttggaatgct agttggagta ataaatctct ggaacagatt tggaatcaca
8101 cgacctggat ggagtgggac agagaaatta acaattacac aagcttaata cactccttaa
8161 ttgaagaatc gcaaaaccag caagaaaaga atgaacaaga attattggaa ttagataaat
8221 gggcaagttt gtggaattgg tttaacataa caaattggct gtggtatata aaattattca
8281 taatgatagt aggaggcttg gtaggtttaa gaatagtttt tgctgtactt tctatagtga
8341 atagagttag gcagggatat tcaccattat cgtttcagac ccacctccca accccgaggg
8401 gacccgacag gcccgaagga atagaagaag aaggtggaga gagagacaga gacagatcca
8461 ttcgattagt gaacggatcc ttggcactta tctgggacga tctgcggagc ctgtgcctct
8521 tcagctacca ccgcttgaga gacttactct tgattgtaac gaggattgtg gaacttctgg
8581 gacgcagggg gtgggaagcc ctcaaatatt ggtggaatct cctacagtat tggagtcagg
8641 aactaaagaa tagtgctgtt agcttgctca atgccacagc catagcagta gctgagggga
8701 cagatagggt tatagaagta gtacaaggag cttgtagagc tattcgccac atacctagaa
8761 gaataagaca gggcttggaa aggattttgc tataagatgg gtggcaagtg gtcaaaaagt
8821 agtgtgattg gatggcctac tgtaagggaa agaatgagac gagctgagcc agcagcagat
8881 agggtgggag cagcatctcg agacctggaa aaacatggag caatcacaag tagcaataca
8941 gcagctacca atgctgcttg tgcctggcta gaagcacaag aggaggagga ggtgggtttt
9001 ccagtcacac ctcaggtacc tttaagacca atgacttaca aggcagctgt agatcttagc
9061 cactttttaa aagaaaaggg gggactggaa gggctaattc actcccaaag aagacaagat
9121 atccttgatc tgtggatcta ccacacacaa ggctacttcc ctgattagca gaactacaca
9181 ccagggccag gggtcagata tccactgacc tttggatggt gctacaagct agtaccagtt
9241 gagccagata agatagaaga ggccaataaa ggagagaaca ccagcttgtt acaccctgtg
9301 agcctgcatg ggatggatga cccggagaga gaagtgttag agtggaggtt tgacagccgc
9361 ctagcatttc atcacgtggc ccgagagctg catccggagt acttcaagaa ctgctgacat
9421 cgagcttgct acaagggact ttccgctggg gactttccag ggaggcgtgg cctgggcggg
9481 actggggagt ggcgagccct cagatcctgc atataagcag ctgctttttg cctgtactgg
9541 gtctctctgg ttagaccaga tctgagcctg ggagctctct ggctaactag ggaacccact
9601 gcttaagcct caataaagct tgccttgagt gcttcaagta gtgtgtgccc gtctgttgtg
9661 tgactctggt aactagagat ccctcagacc cttttagtca gtgtggaaaa tctctagca
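
Because the listing above is a scanned copy, a simple consistency check can help confirm that a transcription of the ORIGIN rows is complete. The sketch below is one possible approach, not part of the original record: it strips position numbers (including scan-split ones such as `3 61`) and spacing from the rows, skips stray non-sequence lines, and tallies the bases for comparison with the BASE COUNT line. The function name `parse_origin_rows` and the two sample rows are chosen for illustration only.

```python
import re
from collections import Counter

# Totals stated on the BASE COUNT line of the record.
BASE_COUNT = {"a": 3411, "c": 1772, "g": 2373, "t": 2163}


def parse_origin_rows(lines):
    """Concatenate the bases of ORIGIN rows. Hypothetical helper.

    A row is any line that begins with a position number; the number and
    all spacing are discarded, so scan-split numbers do no harm.
    """
    bases = []
    for line in lines:
        if not re.match(r"\s*\d", line):
            continue  # skip stray labels or blank lines between rows
        bases.append(re.sub(r"[^acgt]", "", line.lower()))
    return "".join(bases)


# Two consecutive rows as they appear in the scan, plus one stray label.
sample = [
    "301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg",
    "exon",
    "3 61 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat",
]
seq = parse_origin_rows(sample)
counts = Counter(seq)

print(len(seq))                  # 120 bases from the two sample rows
print(sum(BASE_COUNT.values()))  # 9719, the stated sequence length
# For a full transcription, dict(counts) == BASE_COUNT would be the check.
```

Applied to all rows of a complete transcription, the tallies should reproduce the BASE COUNT line and the total should equal the 9719 bp given for the record.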



The oligonucleotide probe for the AP-1 sequence in step 63 of the original application
was synthesized using the PMA-responsive element as the consensus sequence, as indicated
by the reference of Northrop et al., 1993, and adding flanking sequences.
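
The construction described above (a consensus core plus flanking sequences, synthesized as a double-stranded probe) can be sketched as follows. This is an illustration only: the core `TGAGTCA` is the commonly cited PMA/TPA-responsive element consensus, TGA(C/G)TCA, and the flanks `AAAA` and `CCCC` are arbitrary placeholders, not the sequences actually used in the application.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_probe(core: str, left_flank: str, right_flank: str):
    """Return (top, bottom) strands, both written 5'->3'. Hypothetical helper."""
    top = (left_flank + core + right_flank).upper()
    return top, reverse_complement(top)


TRE_CORE = "TGAGTCA"   # commonly cited consensus; assumed here
LEFT_FLANK = "AAAA"    # placeholder flank, not from the application
RIGHT_FLANK = "CCCC"   # placeholder flank, not from the application

top, bottom = build_probe(TRE_CORE, LEFT_FLANK, RIGHT_FLANK)
print(top)     # AAAATGAGTCACCCC
print(bottom)  # GGGGTGACTCATTTT
```

Annealing the two strands gives a duplex in which the core element is read as TGAGTCA on one strand and TGACTCA on the other, both of which fit the TGA(C/G)TCA consensus.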

These two sequences, which were used as probes, are representative examples that
demonstrate the methodology of DNA-protein interaction. Any other relevant
sequence(s) can be used for this purpose, as Claim 4 has been amended by the
Examiner.